Using deep features to create an image classifier


In [1]:
import graphlab

Load CIFAR-10 dataset

This is a popular computer vision dataset used for benchmarking. The version loaded here is a small subset covering four classes (automobile, bird, cat, and dog), already split into a training set and a test set.


In [2]:
image_train_url = 'https://d396qusza40orc.cloudfront.net/phoenixassets/image_train_data.csv'
image_test_url = 'https://d396qusza40orc.cloudfront.net/phoenixassets/image_test_data.csv'

In [3]:
image_train_data = graphlab.SFrame(image_train_url)
image_train_data.head()


Downloading https://d396qusza40orc.cloudfront.net/phoenixassets/image_train_data.csv to /var/tmp/graphlab-williamgray1/19252/8625f2a8-f20a-4077-9db1-58a605fa78f4.csv
Finished parsing file https://d396qusza40orc.cloudfront.net/phoenixassets/image_train_data.csv
Parsing completed. Parsed 100 lines in 1.18667 secs.
------------------------------------------------------
Inferred types from first 100 line(s) of file as 
column_type_hints=[int,str,str,array,array]
If parsing fails due to incorrect types, you can correct
the inferred type list above and pass it to read_csv in
the column_type_hints argument
------------------------------------------------------
Read 1943 lines. Lines per second: 803.644
Finished parsing file https://d396qusza40orc.cloudfront.net/phoenixassets/image_train_data.csv
Parsing completed. Parsed 2005 lines in 2.46638 secs.
Out[3]:
 id | image                | label      | deep_features | image_array
 24 | Height: 32 Width: 32 | bird       | [0.242872, 1.09545, 0.0, 0.39363, 0.0, 0.0, ...] | [73.0, 77.0, 58.0, 71.0, 68.0, 50.0, 77.0, 69.0, ...]
 33 | Height: 32 Width: 32 | cat        | [0.525088, 0.0, 0.0, 0.0, 0.0, 0.0, 9.94829, 0.0, ...] | [7.0, 5.0, 8.0, 7.0, 5.0, 8.0, 5.0, 4.0, 6.0, 7.0, ...]
 36 | Height: 32 Width: 32 | cat        | [0.566016, 0.0, 0.0, 0.0, 0.0, 0.0, 9.9972, 0.0, ...] | [169.0, 122.0, 65.0, 131.0, 108.0, 75.0, ...]
 70 | Height: 32 Width: 32 | dog        | [1.1298, 0.0, 0.0, 0.778194, 0.0, 0.758051, ...] | [154.0, 179.0, 152.0, 159.0, 183.0, 157.0, ...]
 90 | Height: 32 Width: 32 | bird       | [1.71787, 0.0, 0.0, 0.0, 0.0, 0.0, 9.33936, 0.0, ...] | [216.0, 195.0, 180.0, 201.0, 178.0, 160.0, ...]
 97 | Height: 32 Width: 32 | automobile | [1.57819, 0.0, 0.0, 0.0, 0.0, 0.0, 9.00632, 0.0, ...] | [33.0, 44.0, 27.0, 29.0, 44.0, 31.0, 32.0, 45.0, ...]
107 | Height: 32 Width: 32 | dog        | [0.0, 0.0, 0.220678, 0.0, 0.0, 0.0, 8.58053, ...] | [97.0, 51.0, 31.0, 104.0, 58.0, 38.0, 107.0, 61.0, ...]
121 | Height: 32 Width: 32 | bird       | [0.0, 0.237535, 0.0, 0.0, 0.0, 0.0, 9.9908, 0.0, ...] | [93.0, 96.0, 88.0, 102.0, 106.0, 97.0, 117.0, ...]
136 | Height: 32 Width: 32 | automobile | [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 7.57379, 0.0, 0.0, ...] | [35.0, 59.0, 53.0, 36.0, 56.0, 56.0, 42.0, 62.0, ...]
138 | Height: 32 Width: 32 | bird       | [0.658936, 0.0, 0.0, 0.0, 0.0, 0.0, 9.93748, 0.0, ...] | [205.0, 193.0, 195.0, 200.0, 187.0, 193.0, ...]
[10 rows x 5 columns]


In [4]:
image_test_data = graphlab.SFrame(image_test_url)
image_test_data.head()


Downloading https://d396qusza40orc.cloudfront.net/phoenixassets/image_test_data.csv to /var/tmp/graphlab-williamgray1/19252/0360416d-e3ff-46d0-8786-03b9b1658260.csv
Finished parsing file https://d396qusza40orc.cloudfront.net/phoenixassets/image_test_data.csv
Parsing completed. Parsed 100 lines in 1.20322 secs.
------------------------------------------------------
Inferred types from first 100 line(s) of file as 
column_type_hints=[int,str,str,array,array]
If parsing fails due to incorrect types, you can correct
the inferred type list above and pass it to read_csv in
the column_type_hints argument
------------------------------------------------------
Read 1940 lines. Lines per second: 746.417
Finished parsing file https://d396qusza40orc.cloudfront.net/phoenixassets/image_test_data.csv
Parsing completed. Parsed 4000 lines in 4.11863 secs.
Out[4]:
 id | image                | label      | deep_features | image_array
  0 | Height: 32 Width: 32 | cat        | [1.13469, 0.0, 0.0, 0.0, 0.0366498, 0.0, 9.3536, ...] | [158.0, 112.0, 49.0, 159.0, 111.0, 47.0, ...]
  6 | Height: 32 Width: 32 | automobile | [0.231359, 0.0, 0.0, 0.0, 0.0, 0.226023, 8.85989, ...] | [160.0, 37.0, 13.0, 185.0, 49.0, 11.0, ...]
  8 | Height: 32 Width: 32 | cat        | [0.0, 0.0, 0.0344192, 0.0, 0.0, 0.0, 11.0375, ...] | [23.0, 19.0, 23.0, 19.0, 21.0, 28.0, 21.0, 16.0, ...]
  9 | Height: 32 Width: 32 | automobile | [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 11.6065, 0.0, 0.0, ...] | [217.0, 215.0, 209.0, 210.0, 208.0, 202.0, ...]
 12 | Height: 32 Width: 32 | dog        | [0.322317, 0.0, 1.24933, 0.0, 0.0, 0.0, 9.10822, ...] | [91.0, 64.0, 30.0, 82.0, 58.0, 30.0, 87.0, 73.0, ...]
 16 | Height: 32 Width: 32 | dog        | [0.0, 0.0, 0.347357, 0.0, 0.0, 0.0, 9.98674, 0.0, ...] | [95.0, 76.0, 78.0, 92.0, 77.0, 78.0, 89.0, 77.0, ...]
 24 | Height: 32 Width: 32 | dog        | [1.31558, 0.0, 0.0, 0.0, 0.0, 0.0, 8.71812, 0.0, ...] | [136.0, 134.0, 118.0, 142.0, 141.0, 126.0, ...]
 25 | Height: 32 Width: 32 | bird       | [0.0, 0.317289, 0.0, 1.36553, 0.54447, 0.0, ...] | [100.0, 103.0, 74.0, 68.0, 91.0, 65.0, 116.0, ...]
 31 | Height: 32 Width: 32 | dog        | [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 9.26019, 0.0, 0.0, ...] | [127.0, 130.0, 81.0, 130.0, 133.0, 88.0, ...]
 33 | Height: 32 Width: 32 | dog        | [0.130787, 0.727667, 0.0, 0.0, 0.0, 0.0, 10.1179, ...] | [118.0, 113.0, 81.0, 122.0, 117.0, 83.0, ...]
[10 rows x 5 columns]

Train a classifier using raw image pixels (no deep features yet)

This model will be compared against the deep features model later on. The goal is to predict each image's label from its raw pixel values (the image_array column).
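
As a quick sanity check on the raw pixel features, the sketch below inspects one row of the training data. The expected length of 3072 (32 x 32 pixels x 3 color channels) is an assumption based on the "Number of unpacked features : 3072" line in the training log below.

# Sketch: inspect the raw pixel representation of the first training image.
# A 32x32 RGB image should unpack to 32 * 32 * 3 = 3072 values
# (assumption: consistent with the training log below).
first_row = image_train_data[0]
print(len(first_row['image_array']))   # expected: 3072
print(first_row['label'])              # 'bird' for the first row shown above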


In [5]:
raw_pixel_model = graphlab.logistic_classifier.create(image_train_data, target='label',
                                                     features=['image_array'])


PROGRESS: Creating a validation set from 5 percent of training data. This may take a while.
          You can set ``validation_set=None`` to disable validation tracking.

WARNING: The number of feature dimensions in this problem is very large in comparison with the number of examples. Unless an appropriate regularization value is set, this model may not provide accurate predictions for a validation/test set.
Logistic regression:
--------------------------------------------------------
Number of examples          : 1886
Number of classes           : 4
Number of feature columns   : 1
Number of unpacked features : 3072
Number of coefficients    : 9219
Starting L-BFGS
--------------------------------------------------------
+-----------+----------+-----------+--------------+-------------------+---------------------+
| Iteration | Passes   | Step size | Elapsed Time | Training-accuracy | Validation-accuracy |
+-----------+----------+-----------+--------------+-------------------+---------------------+
| 1         | 6        | 0.000015  | 3.697588     | 0.355779          | 0.344538            |
| 2         | 8        | 1.000000  | 4.804952     | 0.386002          | 0.378151            |
| 3         | 9        | 1.000000  | 5.503424     | 0.429480          | 0.462185            |
| 4         | 10       | 1.000000  | 6.161361     | 0.442736          | 0.478992            |
| 5         | 11       | 1.000000  | 6.819093     | 0.449629          | 0.478992            |
| 6         | 12       | 1.000000  | 7.501410     | 0.433722          | 0.428571            |
| 10        | 17       | 1.000000  | 10.485146    | 0.507423          | 0.512605            |
+-----------+----------+-----------+--------------+-------------------+---------------------+
TERMINATED: Iteration limit reached.
This model may not be optimal. To improve it, consider increasing `max_iterations`.
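
The log suggests increasing max_iterations. A minimal sketch of how that could look is below; max_iterations and validation_set are standard arguments of graphlab.logistic_classifier.create, and the value of 50 is illustrative rather than tuned.

# Sketch: re-train the raw pixel model with a higher iteration limit.
raw_pixel_model = graphlab.logistic_classifier.create(image_train_data,
                                                      target='label',
                                                      features=['image_array'],
                                                      max_iterations=50,    # illustrative; the run above stopped at 10
                                                      validation_set=None)  # disable validation tracking, as the progress note suggests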

Predict the first five test images with the raw pixel model


In [6]:
# actual image labels (correct answers)
image_test_data[0:5]['label']


Out[6]:
dtype: str
Rows: 5
['cat', 'automobile', 'cat', 'automobile', 'dog']

In [7]:
# model output
raw_pixel_model.predict(image_test_data[0:5])


Out[7]:
dtype: str
Rows: 5
['bird', 'cat', 'bird', 'automobile', 'dog']

The raw pixel model got only two of the five predictions correct (the automobile and the dog). That's an F.
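
The same check can be done programmatically. The sketch below assumes element-wise SArray comparison, which yields 1/0 flags that can be summed.

# Sketch: count how many of the five predictions match the true labels.
predictions = raw_pixel_model.predict(image_test_data[0:5])
num_correct = (predictions == image_test_data[0:5]['label']).sum()
print(num_correct)   # 2 for the run above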

More general evaluation of the raw pixel model


In [8]:
raw_pixel_model.evaluate(image_test_data)


Out[8]:
{'accuracy': 0.47625, 'auc': 0.7203336249999999, 'confusion_matrix': Columns:
 	target_label	str
 	predicted_label	str
 	count	int
 
 Rows: 16
 
 Data:
 +--------------+-----------------+-------+
 | target_label | predicted_label | count |
 +--------------+-----------------+-------+
 |     dog      |       cat       |  147  |
 |     cat      |       dog       |  400  |
 |     dog      |       dog       |  522  |
 |     bird     |    automobile   |   97  |
 |  automobile  |    automobile   |  609  |
 |     bird     |       cat       |   93  |
 |     bird     |       dog       |  266  |
 |  automobile  |       bird      |  125  |
 |     bird     |       bird      |  544  |
 |  automobile  |       cat       |   99  |
 +--------------+-----------------+-------+
 [16 rows x 3 columns]
 Note: Only the head of the SFrame is printed.
 You can use print_rows(num_rows=m, num_columns=n) to print more rows and columns., 'f1_score': 0.4688285324983248, 'log_loss': 1.220651965245015, 'precision': 0.48034554969626087, 'recall': 0.47625, 'roc_curve': Columns:
 	threshold	float
 	fpr	float
 	tpr	float
 	p	int
 	n	int
 	class	int
 
 Rows: 400004
 
 Data:
 +-----------+-----+-----+------+------+-------+
 | threshold | fpr | tpr |  p   |  n   | class |
 +-----------+-----+-----+------+------+-------+
 |    0.0    | 1.0 | 1.0 | 1000 | 3000 |   0   |
 |   1e-05   | 1.0 | 1.0 | 1000 | 3000 |   0   |
 |   2e-05   | 1.0 | 1.0 | 1000 | 3000 |   0   |
 |   3e-05   | 1.0 | 1.0 | 1000 | 3000 |   0   |
 |   4e-05   | 1.0 | 1.0 | 1000 | 3000 |   0   |
 |   5e-05   | 1.0 | 1.0 | 1000 | 3000 |   0   |
 |   6e-05   | 1.0 | 1.0 | 1000 | 3000 |   0   |
 |   7e-05   | 1.0 | 1.0 | 1000 | 3000 |   0   |
 |   8e-05   | 1.0 | 1.0 | 1000 | 3000 |   0   |
 |   9e-05   | 1.0 | 1.0 | 1000 | 3000 |   0   |
 +-----------+-----+-----+------+------+-------+
 [400004 rows x 6 columns]
 Note: Only the head of the SFrame is printed.
 You can use print_rows(num_rows=m, num_columns=n) to print more rows and columns.}

The accuracy of this model is only 47.6%.
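
The evaluation output truncates the confusion matrix to its first 10 rows. As the output itself notes, print_rows can show the full 16-row matrix (4 true labels x 4 predicted labels); a short sketch:

# Sketch: print the full confusion matrix from the evaluation results.
raw_pixel_results = raw_pixel_model.evaluate(image_test_data)
raw_pixel_results['confusion_matrix'].print_rows(num_rows=16)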

Next, a model leveraging deep features

This model will use transfer learning, since the dataset is so small: deep features extracted by a network trained on the much larger ImageNet dataset are fed into a simple classifier.
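
For comparison with the 3072 raw pixel values, the precomputed deep features can be inspected the same way. The expected length of 4096 is an assumption based on the "Number of unpacked features : 4096" line in the training log below.

# Sketch: check the dimensionality of the precomputed deep features.
print(len(image_train_data[0]['deep_features']))   # expected: 4096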

Computing deep features for the images

The two lines below would compute the deep features. However, this step is computationally intensive, so the deep features are already included in the dataset. If they were not, I would run the two lines below.


In [9]:
# deep_learning_model = graphlab.load_model('http://s3.amazonaws.com/GraphLab-Datasets/deeplearning/imagenet_model_iter45')
# image_train_data['deep_features'] = deep_learning_model.extract_features(image_train_data)

Train a classifier using the deep features


In [10]:
deep_features_model = graphlab.logistic_classifier.create(image_train_data,
                                                         features=['deep_features'],
                                                         target='label')


PROGRESS: Creating a validation set from 5 percent of training data. This may take a while.
          You can set ``validation_set=None`` to disable validation tracking.

WARNING: The number of feature dimensions in this problem is very large in comparison with the number of examples. Unless an appropriate regularization value is set, this model may not provide accurate predictions for a validation/test set.
WARNING: Detected extremely low variance for feature(s) 'deep_features' because all entries are nearly the same.
Proceeding with model training using all features. If the model does not provide results of adequate quality, exclude the above mentioned feature(s) from the input dataset.
Logistic regression:
--------------------------------------------------------
Number of examples          : 1912
Number of classes           : 4
Number of feature columns   : 1
Number of unpacked features : 4096
Number of coefficients    : 12291
Starting L-BFGS
--------------------------------------------------------
+-----------+----------+-----------+--------------+-------------------+---------------------+
| Iteration | Passes   | Step size | Elapsed Time | Training-accuracy | Validation-accuracy |
+-----------+----------+-----------+--------------+-------------------+---------------------+
| 1         | 5        | 0.000131  | 3.364257     | 0.734310          | 0.752688            |
| 2         | 9        | 0.250000  | 6.298938     | 0.759414          | 0.774194            |
| 3         | 10       | 0.250000  | 7.280189     | 0.764121          | 0.763441            |
| 4         | 11       | 0.250000  | 8.243664     | 0.771967          | 0.774194            |
| 5         | 12       | 0.250000  | 9.221105     | 0.775628          | 0.784946            |
| 6         | 13       | 0.250000  | 10.221225    | 0.783473          | 0.784946            |
| 7         | 14       | 0.250000  | 11.168960    | 0.795502          | 0.774194            |
| 8         | 15       | 0.250000  | 12.209590    | 0.814854          | 0.774194            |
| 9         | 16       | 0.250000  | 13.169402    | 0.843619          | 0.795699            |
| 10        | 17       | 0.250000  | 14.110809    | 0.850941          | 0.784946            |
+-----------+----------+-----------+--------------+-------------------+---------------------+
TERMINATED: Iteration limit reached.
This model may not be optimal. To improve it, consider increasing `max_iterations`.

Try predicting the first five images again


In [11]:
# actual image labels (correct answers)
image_test_data[0:5]['label']


Out[11]:
dtype: str
Rows: 5
['cat', 'automobile', 'cat', 'automobile', 'dog']

In [12]:
# model output
deep_features_model.predict(image_test_data[0:5])


Out[12]:
dtype: str
Rows: 5
['cat', 'automobile', 'cat', 'automobile', 'dog']

It got them all correct! A+.
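
To eyeball these predictions, the five test images can be displayed inline. This sketch assumes the 'image' column holds GraphLab Image objects (as the "Height: 32 Width: 32" entries suggest) and uses GraphLab Canvas for rendering.

# Sketch: display the five test images in the notebook for a visual check.
graphlab.canvas.set_target('ipynb')
image_test_data[0:5]['image'].show()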

More general evaluation of the deep features model, similar to the evaluation of the raw pixel model


In [13]:
deep_features_model.evaluate(image_test_data)


Out[13]:
{'accuracy': 0.78025, 'auc': 0.937662249999998, 'confusion_matrix': Columns:
 	target_label	str
 	predicted_label	str
 	count	int
 
 Rows: 16
 
 Data:
 +--------------+-----------------+-------+
 | target_label | predicted_label | count |
 +--------------+-----------------+-------+
 |  automobile  |       cat       |   11  |
 |     dog      |       cat       |  211  |
 |  automobile  |       dog       |   5   |
 |     cat      |       bird      |   90  |
 |     bird     |       dog       |   51  |
 |     dog      |       bird      |   58  |
 |     cat      |    automobile   |   51  |
 |     bird     |       cat       |  112  |
 |     dog      |    automobile   |   21  |
 |     dog      |       dog       |  710  |
 +--------------+-----------------+-------+
 [16 rows x 3 columns]
 Note: Only the head of the SFrame is printed.
 You can use print_rows(num_rows=m, num_columns=n) to print more rows and columns., 'f1_score': 0.7788080106173169, 'log_loss': 0.5723048751210497, 'precision': 0.777826311332351, 'recall': 0.78025, 'roc_curve': Columns:
 	threshold	float
 	fpr	float
 	tpr	float
 	p	int
 	n	int
 	class	int
 
 Rows: 400004
 
 Data:
 +-----------+----------------+-----+------+------+-------+
 | threshold |      fpr       | tpr |  p   |  n   | class |
 +-----------+----------------+-----+------+------+-------+
 |    0.0    |      1.0       | 1.0 | 1000 | 3000 |   0   |
 |   1e-05   |     0.981      | 1.0 | 1000 | 3000 |   0   |
 |   2e-05   | 0.976333333333 | 1.0 | 1000 | 3000 |   0   |
 |   3e-05   |     0.974      | 1.0 | 1000 | 3000 |   0   |
 |   4e-05   | 0.971333333333 | 1.0 | 1000 | 3000 |   0   |
 |   5e-05   |     0.968      | 1.0 | 1000 | 3000 |   0   |
 |   6e-05   | 0.964666666667 | 1.0 | 1000 | 3000 |   0   |
 |   7e-05   | 0.961666666667 | 1.0 | 1000 | 3000 |   0   |
 |   8e-05   |      0.96      | 1.0 | 1000 | 3000 |   0   |
 |   9e-05   | 0.957333333333 | 1.0 | 1000 | 3000 |   0   |
 +-----------+----------------+-----+------+------+-------+
 [400004 rows x 6 columns]
 Note: Only the head of the SFrame is printed.
 You can use print_rows(num_rows=m, num_columns=n) to print more rows and columns.}

Accuracy is 78%, a big improvement over the raw pixel model's 47.6%!
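
To put the two models side by side, the accuracies can be pulled directly from the evaluation dictionaries; a minimal sketch:

# Sketch: compare test accuracy of the two models directly.
raw_accuracy = raw_pixel_model.evaluate(image_test_data)['accuracy']
deep_accuracy = deep_features_model.evaluate(image_test_data)['accuracy']
print('raw pixels: %.3f, deep features: %.3f' % (raw_accuracy, deep_accuracy))
# roughly 0.476 vs 0.780 for the runs above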

